Abstract

The proportionate normalized least mean square (PNLMS) algorithm and its variants are by far the most popular adaptive filters
used to identify sparse systems. The convergence speed of the PNLMS algorithm, though very high initially,
slows down at a later stage, even becoming worse than that of sparsity agnostic adaptive filters like the NLMS. In this paper, we address
this problem by introducing a carefully constructed $l_1$ norm (of the coefficients) penalty in the PNLMS cost function which favors
sparsity. This results in certain zero attractor terms in the PNLMS weight update equation which help in the shrinkage of the
coefficients, especially the inactive taps, thereby arresting the slowing down of convergence and also producing lower steady-state
excess mean square error (EMSE). We also carry out the convergence analysis (in mean) of the proposed algorithm.

Index Terms 

Sparse Adaptive Filter, PNLMS Algorithm, RZA-NLMS algorithm, convergence speed, steady state performance. 


I. Introduction 

In real life, there exist many examples of systems that have a sparse impulse response, having a few significant non-zero
elements (called active taps) amidst several zero or insignificant elements (called inactive taps). One example of such
systems is the network echo canceller [?]-[?], which uses both packet-switched and circuit-switched components and has a
total echo response of about 64-128 ms duration, out of which the “active” region spans only 8-12 ms, while the
remaining “inactive” part accounts for bulk delay due to network loading, encoding and jitter buffer delays. Another example
is the acoustic echo generated due to coupling between microphone and loudspeaker in hands-free mobile telephony, where
the sparsity of the acoustic channel impulse response varies with the loudspeaker-microphone distance [?]. Other well known
examples of sparse systems include HDTV, where clusters of dominant echoes arrive after long periods of silence [?], wireless
multipath channels which, on most occasions, have only a few clusters of significant paths [?], and underwater acoustic
channels, where the various multipath components caused by reflections off the sea surface and sea bed have long intermediate
delays [?]. The last decade witnessed a flurry of research activities [?] that sought to develop sparsity aware adaptive filters
which can exploit the a priori knowledge of the sparseness of the system and thus enjoy improved identification performance.
The first and foremost in this category is the proportionate normalized LMS (PNLMS) algorithm [8], which achieves faster
initial convergence by deploying different step sizes for different weights, with each one made proportional to the magnitude of
the corresponding weight estimate. The convergence rate of the PNLMS algorithm, however, slows down at a later stage of the
iteration and becomes even worse than that of a sparsity agnostic algorithm like the NLMS [?]. This problem was later addressed in
several of its variants like the improved PNLMS (i.e., IPNLMS) algorithm [?], the composite proportionate and normalized LMS
(i.e., CPNLMS) algorithm [10] and the mu law PNLMS (i.e., MPNLMS) algorithm [?]-[?]. These algorithms improve the transient
response (i.e., convergence speed) of the PNLMS algorithm for identifying sparse systems. However, all of them yield almost the
same steady-state excess mean square error (EMSE) performance as produced by the PNLMS. The need to improve both
transient and steady-state performance subsequently led to several variable step-size (VSS), proportionate type algorithms [?]-[?].

In this paper, drawing ideas from [?]-[?], we aim to improve the performance of the PNLMS algorithm further, by
introducing a carefully constructed $l_1$ norm (of the coefficients) penalty in the PNLMS cost function which favors sparsity.
This results in a modified PNLMS update equation with a zero attractor for all the taps, named the Zero-Attracting PNLMS
(ZA-PNLMS) algorithm. The zero attractors help in the shrinkage of the coefficients, which is particularly desirable for the
inactive taps, thereby giving rise to lower steady-state EMSE for sparse systems. Further, by drawing the inactive taps towards
zero, the zero attractors help in arresting the sluggishness of the convergence of the PNLMS algorithm that takes place at a
later stage of the iteration, caused by the diminishing effective step sizes of the inactive taps. We show this by presenting
a detailed convergence analysis of the proposed algorithm, which is, however, a daunting task, especially due to the
presence of a so-called gain matrix and also the zero attractors in the update equation. To overcome the challenges posed by
them, we deploy a transform domain equivalent model of the proposed algorithm and, separately, an elegant scheme of angular
discretization of continuous valued random vectors proposed earlier in [22].
II. Proposed Algorithm


Consider a PNLMS based adaptive filter that takes $x(n)$ as input and updates an $L$-tap coefficient vector $\mathbf{w}(n) = [w_0(n), w_1(n), \cdots, w_{L-1}(n)]^T$ as [8],

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{\mu\,\mathbf{G}(n)\mathbf{x}(n)e(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n) + \delta_P}, \tag{1}$$

where $\mathbf{x}(n) = [x(n), x(n-1), \cdots, x(n-L+1)]^T$ is the input regressor vector, $\mathbf{G}(n)$ is a diagonal matrix that modifies
the step size of each tap, $\mu$ is the overall step size, $\delta_P$ is a regularization parameter and $e(n) = d(n) - \mathbf{w}^T(n)\mathbf{x}(n)$ is the
filter output error, with $d(n)$ denoting the so-called desired response. In the system identification problem under consideration,
$d(n)$ is the observed system output, given as $d(n) = \mathbf{w}_{opt}^T\mathbf{x}(n) + v(n)$, where $\mathbf{w}_{opt}$ is the system impulse response vector
(supposed to be sparse), $\mathbf{x}(n)$ is the system input and $v(n)$ is an observation noise which is assumed to be white with variance
$\sigma_v^2$ and independent of $x(m)$ for all $n$ and $m$.

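The matrix $\mathbf{G}(n)$ is evaluated as

$$\mathbf{G}(n) = \mathrm{diag}(g_0(n), g_1(n), \ldots, g_{L-1}(n)), \tag{2}$$

where

$$g_l(n) = \frac{\gamma_l(n)}{\sum_{i=0}^{L-1}\gamma_i(n)}, \quad 0 \le l \le L-1, \tag{3}$$

with

$$\gamma_l(n) = \max\left[\rho_g \max[\delta, |w_0(n)|, \ldots, |w_{L-1}(n)|],\ |w_l(n)|\right]. \tag{4}$$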


The parameter $\delta$ is an initialization parameter that helps to prevent stalling of the weight updating at the initial stage when all
the taps are initialized to zero. Similarly, if an individual tap weight becomes very small, to avoid stalling of the corresponding
weight update recursion, the respective $\gamma_l(n)$ is taken as a small fraction (given by the constant $\rho_g$) of the largest tap
magnitude. By providing a separate effective step size $\mu g_l(n)$ to each $l$-th tap, where $g_l(n)$ is broadly proportional to $|w_l(n)|$,
the PNLMS algorithm achieves a higher rate of convergence initially, caused primarily by the active taps. At a later stage,
however, the convergence slows down considerably, being controlled primarily by the numerically dominant inactive taps that
have progressively diminishing effective step sizes [?], [?].
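The gain computation of Eqs. (2)-(4) and the update of Eq. (1) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `pnlms_gains` and `pnlms_update` are hypothetical, and the default parameter values are simply borrowed from the simulation settings quoted later in the paper.

```python
def pnlms_gains(w, rho_g=0.01, delta=0.001):
    """Diagonal entries g_l(n) of G(n), following Eqs. (2)-(4)."""
    # Floor applied to every tap: rho_g times the largest of delta and |w_i|.
    floor = rho_g * max([delta] + [abs(c) for c in w])
    gamma = [max(floor, abs(c)) for c in w]
    total = sum(gamma)
    # Normalize so that the gains sum to one, as in Eq. (3).
    return [g / total for g in gamma]


def pnlms_update(w, x, d, mu=0.7, delta_p=0.01, rho_g=0.01, delta=0.001):
    """One PNLMS iteration, Eq. (1). Returns (new weights, a priori error)."""
    g = pnlms_gains(w, rho_g, delta)
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    # Regularized weighted input energy x^T G x + delta_P.
    norm = sum(gi * xi * xi for gi, xi in zip(g, x)) + delta_p
    new_w = [wi + mu * gi * xi * e / norm for wi, gi, xi in zip(w, g, x)]
    return new_w, e
```

With all taps at zero the gains are uniform, so the first iterations behave like a regularized NLMS; once a tap grows, it receives a proportionally larger share of the step size.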

It has recently been shown [?] that the PNLMS weight update recursion (i.e., Eq. (1)) can be obtained by minimizing the
cost function $\|\mathbf{w}(n+1) - \mathbf{w}(n)\|^2_{\mathbf{G}^{-1}(n)}$ subject to the condition $d(n) - \mathbf{w}^T(n+1)\mathbf{x}(n) = 0$ (the notation $\|\mathbf{x}\|^2_{\mathbf{A}}$ indicates
the generalized inner product $\mathbf{x}^T\mathbf{A}\mathbf{x}$). In order to derive the ZA-PNLMS algorithm, following [?], we add an $l_1$ norm penalty
$\gamma\|\mathbf{G}^{-1}(n)\mathbf{w}(n+1)\|_1$ to the above cost function, where $\gamma$ is a very small constant. Note that unlike [?], we have,
however, used a generalized form of the $l_1$ norm penalty here, which scales the elements of $\mathbf{w}(n+1)$ by $\mathbf{G}^{-1}(n)$ before taking
the $l_1$ norm (this scaling makes the $l_1$ norm penalty governed primarily by the inactive taps). The above constrained
optimization problem may then be stated as:

$$\min_{\mathbf{w}(n+1)} \|\mathbf{w}(n+1) - \mathbf{w}(n)\|^2_{\mathbf{G}^{-1}} + \gamma\|\mathbf{G}^{-1}\mathbf{w}(n+1)\|_1 \tag{5}$$

subject to $d(n) - \mathbf{w}^T(n+1)\mathbf{x}(n) = 0$, where the short form notation “$\mathbf{G}^{-1}$” is used to indicate $\mathbf{G}^{-1}(n)$. Using a Lagrange
multiplier $\lambda$, this amounts to minimizing the cost function $J(n+1) = \|\mathbf{w}(n+1) - \mathbf{w}(n)\|^2_{\mathbf{G}^{-1}} + \gamma\|\mathbf{G}^{-1}\mathbf{w}(n+1)\|_1 + \lambda(d(n) - \mathbf{w}^T(n+1)\mathbf{x}(n))$. Setting $\partial J(n+1)/\partial\mathbf{w}(n+1) = \mathbf{0}$, one obtains,

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \left[\gamma\,\mathrm{sgn}(\mathbf{w}(n+1)) - \lambda\,\mathbf{G}(n)\mathbf{x}(n)\right] \tag{6}$$


where sgn(.) is the well known signum function, i.e., sgn(x) = 1 (x > 0), 0 (x = 0), —1 (x < 0). Premultiplying both the 
LHS and the RHS of (6) by x T (n) and using the condition d(n) — w T (n + l)x(n) = 0, one obtains, 

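$$\lambda = \frac{e(n) + \gamma\,\mathbf{x}^T(n)\,\mathrm{sgn}(\mathbf{w}(n+1))}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n)}. \tag{7}$$

Substituting (7) in (6), we have,

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{e(n)\mathbf{G}(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n)} - \gamma\left[\mathbf{I} - \frac{\mathbf{G}(n)\mathbf{x}(n)\mathbf{x}^T(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n)}\right]\mathrm{sgn}(\mathbf{w}(n+1)). \tag{8}$$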
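The algebra behind (6)-(8) can be spelled out as a worked check. The sketch below assumes that $\mathbf{G}(n)$ is diagonal with strictly positive entries, so that $\mathrm{sgn}(\mathbf{G}^{-1}\mathbf{w}) = \mathrm{sgn}(\mathbf{w})$, and that the factor $1/2$ produced by differentiating the quadratic term is absorbed into $\gamma$ and $\lambda$; the latter is our reading of the derivation, not something the text states explicitly.

```latex
\begin{align*}
\frac{\partial J(n+1)}{\partial \mathbf{w}(n+1)}
  &= 2\,\mathbf{G}^{-1}\left[\mathbf{w}(n+1)-\mathbf{w}(n)\right]
     + \gamma\,\mathbf{G}^{-1}\mathrm{sgn}(\mathbf{w}(n+1))
     - \lambda\,\mathbf{x}(n) = \mathbf{0} \\
\Rightarrow\quad \mathbf{w}(n+1)
  &= \mathbf{w}(n) - \tfrac{\gamma}{2}\,\mathrm{sgn}(\mathbf{w}(n+1))
     + \tfrac{\lambda}{2}\,\mathbf{G}\,\mathbf{x}(n)
     \qquad (\gamma/2 \to \gamma,\ \lambda/2 \to \lambda \text{ gives the form of (6)}) \\
\mathbf{x}^T(n)\mathbf{w}(n+1) = d(n)
  \quad\Rightarrow\quad
  \lambda &= \frac{e(n) + \gamma\,\mathbf{x}^T(n)\,\mathrm{sgn}(\mathbf{w}(n+1))}
                  {\mathbf{x}^T(n)\mathbf{G}\,\mathbf{x}(n)} \\
\Rightarrow\quad \mathbf{w}(n+1)
  &= \mathbf{w}(n) + \frac{e(n)\,\mathbf{G}\,\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{G}\,\mathbf{x}(n)}
     - \gamma\left[\mathbf{I} - \frac{\mathbf{G}\,\mathbf{x}(n)\mathbf{x}^T(n)}
                                      {\mathbf{x}^T(n)\mathbf{G}\,\mathbf{x}(n)}\right]
       \mathrm{sgn}(\mathbf{w}(n+1))
\end{align*}
```

Premultiplying the last line by $\mathbf{x}^T(n)$ makes the bracketed term vanish, which confirms that the constraint $d(n) - \mathbf{w}^T(n+1)\mathbf{x}(n) = 0$ is met exactly before any approximation is introduced.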


Note that the above equation does not provide the desired weight update relation, as the R.H.S. contains the unknown term
$\mathrm{sgn}(\mathbf{w}(n+1))$. In order to obtain a feasible weight update equation, we approximate $\mathrm{sgn}(\mathbf{w}(n+1))$ by an estimate, namely,
$\mathrm{sgn}(\mathbf{w}(n))$, which is known. This is based on the assumption that most of the weights do not undergo a change of sign as they
get updated. This assumption may not, however, appear to be very accurate, especially for the inactive taps that fluctuate
around zero in the steady state. Nevertheless, an analysis of the proposed algorithm, as given later in this paper, shows
that this approximation does not have any serious effect on the convergence behavior of the proposed algorithm. Apart from
this, we also observe that in (8), the elements of the matrix $\frac{\mathbf{G}(n)\mathbf{x}(n)\mathbf{x}^T(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n)}$ have magnitudes much less than 1, especially for large
order filters, and thus, this term can be neglected in comparison to $\mathbf{I}$.

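From the above, and introducing the algorithm step size $\mu$ and a regularization parameter $\delta_P$ in (8), for a large order adaptive
filter, one then obtains the following weight update equation:

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{\mu\,e(n)\mathbf{G}(n)\mathbf{x}(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n) + \delta_P} - \rho\,\mathrm{sgn}(\mathbf{w}(n)), \tag{9}$$

where $\rho = \mu\gamma$.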

Eq. (9) provides the weight update relation for the proposed ZA-PNLMS algorithm, where the second term on the R.H.S.
is the usual PNLMS update term while the last term, i.e., $\rho\,\mathrm{sgn}(\mathbf{w}(n))$, is the so-called zero attractor. The zero attractor adds
$-\rho\,\mathrm{sgn}(w_i(n))$ to $w_i(n)$ and thus helps in its shrinkage to zero. Ideally, the zero attraction should be confined only to the
inactive taps, which means that the proposed ZA-PNLMS algorithm will perform particularly well for systems that are highly
sparse, but its performance may degrade as the number of active taps increases. In such cases, Eq. (9) may be further refined
by applying the reweighting concept [?] to it. For this, we replace the $l_1$ regularization term $\|\mathbf{G}^{-1}\mathbf{w}(n+1)\|_1$ in (5) by
a log-sum penalty $\sum_{i=0}^{L-1} g_i^{-1}(n)\log(1 + |w_i(n+1)|/\varepsilon')$, where $g_i(n)$ is the $i$-th diagonal element of $\mathbf{G}(n)$ and $\varepsilon'$ is a small
constant. Following the same steps as used above to derive the ZA-PNLMS algorithm, one can then obtain the RZA-PNLMS
weight update equation as given by

$$w_i(n+1) = w_i(n) + \frac{\mu\,g_i(n)\,x(n-i)\,e(n)}{\mathbf{x}^T(n)\mathbf{G}(n)\mathbf{x}(n) + \delta_P} - \rho\,\frac{\mathrm{sgn}(w_i(n))}{1 + \varepsilon|w_i(n)|}, \quad i = 0, 1, \cdots, L-1, \tag{10}$$

where $\varepsilon = 1/\varepsilon'$ and $\rho = \mu\gamma\varepsilon$. The last term of (10), named the reweighted zero attractor, provides a selective shrinkage to the
taps. Due to this reweighted zero attractor, the inactive taps with zero magnitudes or magnitudes comparable to $1/\varepsilon$ undergo
higher shrinkage compared to the active taps, which enhances the performance both in terms of convergence speed and steady-state
EMSE.
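A minimal Python sketch of the two updates, Eq. (9) and Eq. (10), is given below. The function names and the convention of selecting the reweighted attractor through an optional `eps` argument are illustrative assumptions, not part of the paper; the gains follow Eqs. (2)-(4).

```python
def sgn(v):
    """Signum function: 1, 0 or -1."""
    return (v > 0) - (v < 0)


def proportionate_gains(w, rho_g=0.01, delta=0.001):
    """Diagonal entries of G(n), Eqs. (2)-(4)."""
    floor = rho_g * max([delta] + [abs(c) for c in w])
    gamma = [max(floor, abs(c)) for c in w]
    total = sum(gamma)
    return [g / total for g in gamma]


def za_pnlms_update(w, x, d, mu=0.7, rho=1e-4, delta_p=0.01, eps=None):
    """One ZA-PNLMS step (eps=None, Eq. (9)) or RZA-PNLMS step (eps>0, Eq. (10))."""
    g = proportionate_gains(w)
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    norm = sum(gi * xi * xi for gi, xi in zip(g, x)) + delta_p
    new_w = []
    for wi, gi, xi in zip(w, g, x):
        attractor = rho * sgn(wi)
        if eps is not None:
            # Reweighting: taps with larger magnitude are attracted less.
            attractor /= 1.0 + eps * abs(wi)
        new_w.append(wi + mu * gi * xi * e / norm - attractor)
    return new_w
```

Feeding an all-zero regressor isolates the attractor: with `rho=0.01`, the taps `[0.5, -0.2, 0.0]` move to `[0.49, -0.19, 0.0]` under Eq. (9), whereas with `eps=10` the larger tap is shrunk by only `0.01/6` and the smaller one by `0.01/3`.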


III. Convergence Analysis of the Proposed ZA-PNLMS Algorithm 

A convergence analysis of the PNLMS algorithm is known to be a daunting task, due to the presence of $\mathbf{G}(n)$ both in the
numerator and the denominator of the weight update term in (1), which again depends on $\mathbf{w}(n)$. The presence of the zero
attractor term makes it further complicated for the proposed ZA-PNLMS algorithm, i.e., Eq. (9). To analyze the latter, we follow
here an approach adopted recently in [26] in the context of the PNLMS algorithm. This involves development of an equivalent
transform domain model of the proposed algorithm first. A convergence analysis of the proposed algorithm is then carried out
by applying to the equivalent model a scheme of angular discretization of continuous valued random vectors proposed first by
Slock [22] and used later by several other researchers [24], [25].


A. A Transform Domain Model of the Proposed Algorithm 

The proposed equivalent model uses a diagonal ‘transform’ matrix $\mathbf{G}^{1/2}(n)$ with $[\mathbf{G}^{1/2}(n)]_{i,i} = g_i^{1/2}(n)$, $i = 0, 1, \cdots, L-1$,
to transform the input vector $\mathbf{x}(n)$ and the filter coefficient vector $\mathbf{w}(n)$ to their ‘transformed’ versions, given respectively
as $\mathbf{s}(n) = \mathbf{G}^{1/2}(n)\mathbf{x}(n)$ and $\mathbf{w}_N(n) = [\mathbf{G}^{1/2}(n)]^{-1}\mathbf{w}(n)$. It is easy to check that $\mathbf{w}_N^T(n)\mathbf{s}(n) = \mathbf{w}^T(n)\mathbf{x}(n) = y(n)$ (say),
i.e., the filter $\mathbf{w}_N(n)$ with input vector $\mathbf{s}(n)$ produces the same output $y(n)$ as produced by $\mathbf{w}(n)$ with input vector $\mathbf{x}(n)$. To
compute $\mathbf{G}^{1/2}(n+1)$ and $\mathbf{w}_N(n+1)$, the filter $\mathbf{w}_N(n)$ is first updated to a weight vector $\mathbf{w}'_N(n+1)$ as

$$\mathbf{w}'_N(n+1) = \mathbf{w}_N(n) + \frac{\mu\,e(n)\mathbf{s}(n)}{\mathbf{s}^T(n)\mathbf{s}(n) + \delta_P} - \rho\,\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n)). \tag{11}$$

From (9), it is easy to check that $\mathbf{w}(n+1)$ is given by $\mathbf{w}(n+1) = \mathbf{G}^{1/2}(n)\mathbf{w}'_N(n+1)$. The matrix $\mathbf{G}(n+1)$ follows from $\mathbf{w}(n+1)$
by its definition, and $\mathbf{w}_N(n+1)$ is then evaluated as $\mathbf{w}_N(n+1) = [\mathbf{G}^{1/2}(n+1)]^{-1}\mathbf{w}(n+1)$. From the above, it follows that
$\mathbf{w}_N(n+1) = \mathbf{G}^{-1/2}(n+1)\mathbf{w}(n+1) = \mathbf{G}^{-1/2}(n+1)\mathbf{G}^{1/2}(n)\mathbf{w}'_N(n+1)$, meaning $[\mathbf{w}_N(n+1)]_i = \left[\frac{g_i(n)}{g_i(n+1)}\right]^{1/2}[\mathbf{w}'_N(n+1)]_i$, $i = 0, 1, \cdots, L-1$.
Since $\sum_{i=0}^{L-1} g_i(n) = 1$ and $0 < g_i(n) < 1$, $i = 0, 1, \cdots, L-1$, it is reasonable to expect that $g_i(n)$ does not
change significantly from index $n$ to index $(n+1)$ [especially near convergence and/or for large order filters] and thus, we
can make the approximation $[g_i(n)]^{1/2}[\mathbf{w}'_N(n+1)]_i \approx [g_i(n+1)]^{1/2}[\mathbf{w}'_N(n+1)]_i$, which implies $\mathbf{w}_N(n+1) \approx \mathbf{w}'_N(n+1)$.
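The equivalence between the direct update (9) and the transform domain update (11) can be checked numerically for a single iteration. The following is a sketch with hypothetical function names; it verifies only the exact identity $\mathbf{w}(n+1) = \mathbf{G}^{1/2}(n)\mathbf{w}'_N(n+1)$, not the subsequent approximation $\mathbf{w}_N(n+1) \approx \mathbf{w}'_N(n+1)$.

```python
import math


def sgn(v):
    return (v > 0) - (v < 0)


def gains(w, rho_g=0.01, delta=0.001):
    """Diagonal entries of G(n), Eqs. (2)-(4)."""
    floor = rho_g * max([delta] + [abs(c) for c in w])
    gamma = [max(floor, abs(c)) for c in w]
    total = sum(gamma)
    return [g / total for g in gamma]


def za_pnlms_direct(w, x, d, mu, rho, delta_p):
    """Eq. (9) applied directly to w(n)."""
    g = gains(w)
    e = d - sum(wi * xi for wi, xi in zip(w, x))
    norm = sum(gi * xi * xi for gi, xi in zip(g, x)) + delta_p
    return [wi + mu * gi * xi * e / norm - rho * sgn(wi)
            for wi, gi, xi in zip(w, g, x)]


def za_pnlms_transform(w, x, d, mu, rho, delta_p):
    """Eq. (11) in the transform domain, mapped back by G^{1/2}(n)."""
    root = [math.sqrt(gi) for gi in gains(w)]
    s = [r * xi for r, xi in zip(root, x)]        # s(n) = G^{1/2} x(n)
    w_n = [wi / r for wi, r in zip(w, root)]      # w_N(n) = G^{-1/2} w(n)
    e = d - sum(a * b for a, b in zip(w_n, s))    # same error as the direct form
    norm = sum(si * si for si in s) + delta_p
    w_n_prime = [wn + mu * e * si / norm - rho * sgn(wn) / r
                 for wn, si, r in zip(w_n, s, root)]
    return [r * wp for r, wp in zip(root, w_n_prime)]
```

Both functions return the same vector up to rounding, because $\mathbf{s}^T\mathbf{s} = \mathbf{x}^T\mathbf{G}\mathbf{x}$ and $\mathrm{sgn}(\mathbf{w}_N(n)) = \mathrm{sgn}(\mathbf{w}(n))$ for positive gains.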









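B. Angular Discretization of a Continuous Valued Random Vector [22]

As per this, given a zero mean, $L \times 1$ random vector $\mathbf{x}$ with correlation matrix $\mathbf{R} = E[\mathbf{x}\mathbf{x}^T]$, it is assumed that $\mathbf{x}$ can assume
only one of the $2L$ orthogonal directions, given by $\pm\mathbf{e}_i$, $i = 0, 1, \cdots, L-1$, where $\mathbf{e}_i$ is the $i$-th normalized eigenvector of
$\mathbf{R}$ corresponding to the eigenvalue $\lambda_i$. In particular, $\mathbf{x}$ is modeled as $\mathbf{x} = s\,r\,\mathbf{v}$, where $\mathbf{v} \in \{\mathbf{e}_i \mid i = 0, 1, \cdots, L-1\}$, with
probability of $\mathbf{v} = \mathbf{e}_i$ given by $p_i$, $r = \|\mathbf{x}\|$, i.e., $r$ has the same distribution as that of $\|\mathbf{x}\|$, and $s \in \{1, -1\}$, with probability
of $s = \pm 1$ given by $P(s = \pm 1) = \frac{1}{2}$. Further, the three elements $s$, $r$ and $\mathbf{v}$ are assumed to be mutually independent. Note
that as $s$ is zero mean, $E[s\,r\,\mathbf{v}] = \mathbf{0}$ and thus $E[\mathbf{x}] = \mathbf{0}$ is satisfied trivially. To satisfy $E[\mathbf{x}\mathbf{x}^T] = \mathbf{R}$, the discrete probability
$p_i$ is taken as $p_i = \frac{\lambda_i}{\mathrm{Tr}[\mathbf{R}]}$, which satisfies $p_i \ge 0$, $\sum_{i=0}^{L-1} p_i = 1$ and leads to $E[\mathbf{x}\mathbf{x}^T] = E(s^2 r^2 \mathbf{v}\mathbf{v}^T) = E(r^2)\,E(\mathbf{v}\mathbf{v}^T) = \mathrm{Tr}[\mathbf{R}]\sum_{i=0}^{L-1} p_i\,\mathbf{e}_i\mathbf{e}_i^T = \sum_{i=0}^{L-1}\lambda_i\,\mathbf{e}_i\mathbf{e}_i^T = \mathbf{R}$.
Also note that if $\theta_i$ is the angle between $\mathbf{x}$ and $\mathbf{e}_i$, then $\cos(\theta_i) = \frac{\mathbf{x}^T\mathbf{e}_i}{\|\mathbf{x}\|}$ and
$E[\cos^2(\theta_i)] \approx \frac{\lambda_i}{\mathrm{Tr}[\mathbf{R}]} = p_i$, meaning $p_i$ provides a measure of how far $\mathbf{x}$ is (angularly) from $\mathbf{e}_i$ on average.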
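The consistency of the model with $E[\mathbf{x}\mathbf{x}^T] = \mathbf{R}$ can be verified on a small example. The sketch below is restricted, purely for illustration, to a $2 \times 2$ symmetric matrix with a non-zero off-diagonal entry so that the eigenpairs have a closed form; the function names are hypothetical.

```python
import math


def eig_sym_2x2(a, b, d):
    """Eigenpairs of [[a, b], [b, d]] (b != 0) as (eigenvalue, unit eigenvector)."""
    mean = (a + d) / 2.0
    half = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + half, mean - half):
        vx, vy = b, lam - a               # solves (a - lam) * vx + b * vy = 0
        n = math.hypot(vx, vy)
        pairs.append((lam, (vx / n, vy / n)))
    return pairs


def discretized_correlation(a, b, d):
    """Direction probabilities p_i and the correlation matrix implied by x = s r v."""
    pairs = eig_sym_2x2(a, b, d)
    trace = sum(lam for lam, _ in pairs)          # E[r^2] = E||x||^2 = Tr[R]
    probs = [lam / trace for lam, _ in pairs]     # p_i = lambda_i / Tr[R]
    rec = [[0.0, 0.0], [0.0, 0.0]]
    for p, (_, e) in zip(probs, pairs):
        for i in range(2):
            for j in range(2):
                # E[x x^T] = E[r^2] * sum_i p_i e_i e_i^T (s^2 = 1)
                rec[i][j] += trace * p * e[i] * e[j]
    return probs, rec
```

For $\mathbf{R}$ with rows $(2, 1)$ and $(1, 2)$, the eigenvalues are 3 and 1, so the two directions are selected with probabilities 0.75 and 0.25, and the reconstructed matrix reproduces $\mathbf{R}$.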

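In our analysis of the proposed algorithm, we use the above model to represent the transformed input vector $\mathbf{s}(n)$ as

$$\mathbf{s}(n) = s_s(n)\,r_s(n)\,\mathbf{v}_s(n), \tag{12}$$

where $s_s(n) \in \{+1, -1\}$ with $P(s_s(n) = \pm 1) = \frac{1}{2}$, $r_s(n) = \|\mathbf{s}(n)\|$ and $\mathbf{v}_s(n) \in \{\mathbf{e}_{s,i}(n) \mid i = 0, 1, \cdots, L-1\}$ with
$P(\mathbf{v}_s(n) = \mathbf{e}_{s,i}(n)) = \frac{\lambda_{s,i}(n)}{\mathrm{Tr}[\mathbf{S}(n)]}$, where $\mathbf{S}(n) = E[\mathbf{s}(n)\mathbf{s}^T(n)]$, $\lambda_{s,i}(n)$ is the $i$-th eigenvalue of $\mathbf{S}(n)$ with $\mathbf{e}_{s,i}(n)$ the corresponding normalized eigenvector, and as before, the three
elements $s_s(n)$, $r_s(n)$ and $\mathbf{v}_s(n)$ are mutually independent.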


C. Convergence of the ZA-PNLMS Algorithm in mean 

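Now, defining the weight error vector at the $n$-th index as $\tilde{\mathbf{w}}(n) = \mathbf{w}_{opt} - \mathbf{w}(n)$, the transform domain weight error vector as
$\tilde{\mathbf{w}}_N(n) = \mathbf{G}^{-1/2}(n)\tilde{\mathbf{w}}(n) = \mathbf{G}^{-1/2}(n)\mathbf{w}_{opt} - \mathbf{w}_N(n)$ and expressing $e(n) = \mathbf{s}^T(n)\tilde{\mathbf{w}}_N(n) + v(n)$, the recursive form of the
weight error vectors can then be obtained as

$$\tilde{\mathbf{w}}_N(n+1) = \tilde{\mathbf{w}}_N(n) - \frac{\mu\,\mathbf{s}(n)\mathbf{s}^T(n)\tilde{\mathbf{w}}_N(n)}{\mathbf{s}^T(n)\mathbf{s}(n) + \delta_P} - \frac{\mu\,\mathbf{s}(n)v(n)}{\mathbf{s}^T(n)\mathbf{s}(n) + \delta_P} + \rho\,\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n)). \tag{13}$$

For our analysis here, we approximate $\delta_P$ by zero in (13), as $\delta_P$ is a very small constant. The first order convergence of
the ZA-PNLMS is then provided in the following theorem.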


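Theorem 1. With a zero-mean input $x(n)$ of covariance matrix $\mathbf{R}$, the ZA-PNLMS algorithm produces stable $\mathbf{w}_N(n)$ and
also $\mathbf{w}(n)$ (in the mean) if the step size $\mu$ satisfies $0 < \mu < 2$, and under this condition, $\mathbf{w}_N(n)$ and $\mathbf{w}(n)$ converge in the mean respectively as per the
following:

$$\bar{\mathbf{w}}_N(\infty) = \lim_{n\to\infty} E(\mathbf{w}_N(n)) = E(\mathbf{G}^{-1/2}(n))\big|_{\infty}\,\mathbf{w}_{opt} - \frac{\rho}{\mu}\,\mathrm{Tr}(\mathbf{S}(\infty))\,\mathbf{S}^{-1}(\infty)\lim_{n\to\infty} E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n))) \tag{14}$$

and

$$\bar{\mathbf{w}}(\infty) = \lim_{n\to\infty} E(\mathbf{w}(n)) = \mathbf{w}_{opt} - \frac{\rho}{\mu}\,\mathrm{Tr}(\mathbf{S}(\infty))\,E(\mathbf{G}^{1/2}(n))\big|_{\infty}\,\mathbf{S}^{-1}(\infty)\lim_{n\to\infty} E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}(n))), \tag{15}$$

where $\mathbf{S}(n) = E(\mathbf{s}(n)\mathbf{s}^T(n)) = E(\mathbf{G}^{1/2}(n)\mathbf{R}\,\mathbf{G}^{1/2}(n))$.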


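Proof: For analysis, we substitute $\delta_P = 0$ in (13), as $\delta_P$ is a very small constant. Taking expectation of both sides
of (13), invoking the well known “independence assumption” that allows taking $\tilde{\mathbf{w}}_N(n)$ to be statistically independent of
$\mathbf{s}(n)$, and noting that the term involving $v(n)$ drops out for zero-mean noise independent of the input, we then obtain,

$$E(\tilde{\mathbf{w}}_N(n+1)) = E(\tilde{\mathbf{w}}_N(n)) - \mu\,E\!\left(\frac{\mathbf{s}(n)\mathbf{s}^T(n)}{\mathbf{s}^T(n)\mathbf{s}(n)}\right)E(\tilde{\mathbf{w}}_N(n)) + \rho\,E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n)))$$

$$\Rightarrow\ E(\tilde{\mathbf{w}}_N(n+1)) = (\mathbf{I} - \mu\mathbf{B}(n))\,E(\tilde{\mathbf{w}}_N(n)) + \rho\,E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n))), \tag{16}$$

where

$$\mathbf{B}(n) = E\!\left(\frac{\mathbf{s}(n)\mathbf{s}^T(n)}{\mathbf{s}^T(n)\mathbf{s}(n)}\right). \tag{17}$$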








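Note that $\mathbf{B}(n)$ is symmetric and therefore, one can have its eigendecomposition $\mathbf{B}(n) = \mathbf{E}(n)\mathbf{D}(n)\mathbf{E}^T(n)$, where $\mathbf{E}(n) = [\mathbf{e}_0(n)\ \mathbf{e}_1(n)\ \cdots\ \mathbf{e}_{L-1}(n)]$, $\mathbf{D}(n) = \mathrm{diag}[\lambda_0(n), \lambda_1(n), \cdots, \lambda_{L-1}(n)]$, with $\mathbf{e}_i(n)$ and $\lambda_i(n)$, $i = 0, 1, \cdots, L-1$, denoting
the $i$-th eigenvector and eigenvalue of $\mathbf{B}(n)$ respectively. The eigenvalues are real and the eigenvectors $\mathbf{e}_i(n)$ are mutually
orthonormal, meaning $\mathbf{E}(n)$ is unitary, i.e., $\mathbf{E}^T(n)\mathbf{E}(n) = \mathbf{E}(n)\mathbf{E}^T(n) = \mathbf{I}$. From $\mathbf{B}(n)\mathbf{e}_i(n) = \lambda_i(n)\mathbf{e}_i(n)$ and the fact that
$\|\mathbf{e}_i(n)\|^2 = 1$, it is easy to observe that

$$\lambda_i(n) = \mathbf{e}_i^T(n)\mathbf{B}(n)\mathbf{e}_i(n) = E\!\left(\frac{[\mathbf{s}^T(n)\mathbf{e}_i(n)]^2}{\|\mathbf{s}(n)\|^2}\right).$$

Two observations can be made now:

1) $\lambda_i(n) > 0$ [theoretically, one can have $\lambda_i(n) = 0$ also, provided $\mathbf{s}^T(n)\mathbf{e}_i(n) = 0$, i.e., $\mathbf{s}(n)$ is orthogonal to $\mathbf{e}_i(n)$ in
each trial, which is ruled out here].

2) From the Cauchy-Schwarz inequality, $[\mathbf{s}^T(n)\mathbf{e}_i(n)]^2 \le \|\mathbf{s}(n)\|^2\|\mathbf{e}_i(n)\|^2 = \|\mathbf{s}(n)\|^2$, meaning $\lambda_i(n) \le 1$.

Pre-multiplying both sides of (16) by $\mathbf{E}^T(n)$, defining $\mathbf{u}(n) = \mathbf{E}^T(n)E(\tilde{\mathbf{w}}_N(n))$, $\mathbf{v}(n+1) = \mathbf{E}^T(n)E(\tilde{\mathbf{w}}_N(n+1))$, $\mathbf{z}(n) = \mathbf{E}^T(n)E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n)))$, substituting $\mathbf{B}(n)$ by $\mathbf{E}(n)\mathbf{D}(n)\mathbf{E}^T(n)$ and using the unitariness of $\mathbf{E}(n)$, we have,

$$\mathbf{v}(n+1) = (\mathbf{I} - \mu\mathbf{D}(n))\mathbf{u}(n) + \rho\,\mathbf{z}(n). \tag{18}$$

Taking norm on both sides of (18) and invoking the triangle inequality property of norm, i.e., $\|\mathbf{a} + \mathbf{b}\| \le \|\mathbf{a}\| + \|\mathbf{b}\|$, we then
obtain,

$$\|\mathbf{v}(n+1)\| \le \|(\mathbf{I} - \mu\mathbf{D}(n))\mathbf{u}(n)\| + \rho\,\|\mathbf{z}(n)\|. \tag{19}$$

Since $\mathbf{E}(n)$ is unitary, we have $\|\mathbf{v}(n+1)\| = \|E(\tilde{\mathbf{w}}_N(n+1))\|$, $\|\mathbf{u}(n)\| = \|E(\tilde{\mathbf{w}}_N(n))\|$ and
$\|\mathbf{z}(n)\| = \|E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}_N(n)))\|$. Using the fact that $\{E[g_i^{-1/2}(n)\,\mathrm{sgn}(w_{N,i}(n))]\}^2 \le E(g_i^{-1}(n))$ (i.e., using the Cauchy-Schwarz
inequality and the fact that $\mathrm{sgn}^2(\cdot) \le 1$), we can write $\|\mathbf{z}(n)\| \le \sqrt{\sum_{i=0}^{L-1} E(g_i^{-1}(n))} = c(n)$ (say). Clearly,
$c(n)$ is finite, as $\mathbf{G}(n)$ is a diagonal matrix with only positive elements. From (19), one can then write,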


$$\|E(\tilde{\mathbf{w}}_N(n+1))\| \le \sqrt{\sum_{i=0}^{L-1}(1 - \mu\lambda_i(n))^2\,|u_i(n)|^2} + \rho\,c(n). \tag{20}$$


We now select $\mu$ so that $|1 - \mu\lambda_i(n)| < 1$, or, equivalently, $-1 < 1 - \mu\lambda_i(n) < 1$, which leads to the following:

1) $\mu\lambda_i(n) > 0$, meaning $\mu > 0$ as $\lambda_i(n) > 0$ (as explained above).

2) $\mu < \frac{2}{\lambda_i(n)}$ (since $\lambda_i(n) \le 1$ or, equivalently, $\frac{1}{\lambda_i(n)} \ge 1$, it will be sufficient to take $\mu < 2$ for satisfying this inequality).

Therefore, for $0 < \mu < 2$, we have $|1 - \mu\lambda_i(n)| < 1$, where $0 < \lambda_i(n) \le 1$. Let $\|E(\tilde{\mathbf{w}}_N(n))\| = \theta(n)$ and $k(n) = \max\{|1 - \mu\lambda_i(n)|,\ i = 0, 1, \cdots, L-1\}$, meaning $0 \le k(n) < 1$. From (20), one can then write,


$$\theta(n+1) \le k(n)\sqrt{\sum_{i=0}^{L-1}|u_i(n)|^2} + \rho\,c(n) = k(n)\,\theta(n) + \rho\,c(n). \tag{21}$$

Proceeding recursively backwards till $n = 0$,

$$\theta(n+1) \le \left(\prod_{i=0}^{n} k(n-i)\right)\theta(0) + \rho\left[c(n) + \sum_{l=0}^{n-1}\left(\prod_{i=0}^{l} k(n-i)\right)c(n-l-1)\right]. \tag{22}$$


Clearly, for $0 \le k(n) < 1$, the first term on the RHS of (22) vanishes as $n$ approaches infinity. For the second term, $c(n)$ is a bounded
sequence which, in steady state, can be taken to be time invariant, say $c$, as the variations of $g_i(n)$ vs. $n$, $i = 0, 1, \cdots, L-1$,
in the steady state are negligible. Also note that $\prod_{i=0}^{l} k(n-i)$ is a decaying function of $l$ since $0 \le k(m) < 1$ at any
index $m$. From these, one can write $\lim_{n\to\infty}\|E(\tilde{\mathbf{w}}_N(n))\| = \lim_{n\to\infty}\theta(n) \le \rho K$, where $K$ is a positive constant. Recalling that
$\tilde{\mathbf{w}}(n) = \mathbf{G}^{1/2}(n)\tilde{\mathbf{w}}_N(n)$, we can then write

$$\lim_{n\to\infty}\|E(\tilde{\mathbf{w}}(n))\| = \lim_{n\to\infty}\left\|E(\mathbf{G}^{1/2}(n))\,E(\tilde{\mathbf{w}}_N(n))\right\| < \lim_{n\to\infty}\|E(\tilde{\mathbf{w}}_N(n))\| \le \rho K.$$

Since $\rho$ is very small, this implies that $E(\mathbf{w}(n))$ will remain in close vicinity of $\mathbf{w}_{opt}$ in the steady state under the condition
$0 < \mu < 2$. In other words, $E(\mathbf{w}(n))$ will provide a biased estimate of $\mathbf{w}_{opt}$, though the bias, being proportional to $\rho$, will be
negligibly small.
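The behavior of the bound can be illustrated by iterating the recursion $\theta(n+1) = k\,\theta(n) + \rho c$ with equality. The sketch below assumes, for simplicity, that both $k(n)$ and $c(n)$ are time invariant (the text makes this simplification only for $c(n)$ in the steady state); the numerical values and function names are hypothetical.

```python
def contraction_factor(mu, eigenvalues):
    """k = max_i |1 - mu * lambda_i| for eigenvalues lambda_i in (0, 1]."""
    return max(abs(1.0 - mu * lam) for lam in eigenvalues)


def bound_trajectory(theta0, k, rho, c, steps):
    """Iterate theta(n+1) = k * theta(n) + rho * c, the equality case of the bound."""
    theta = theta0
    trajectory = [theta]
    for _ in range(steps):
        theta = k * theta + rho * c
        trajectory.append(theta)
    return trajectory
```

For constant $k < 1$ the iteration settles at $\rho c/(1-k)$, which is proportional to $\rho$ and plays the role of the constant $\rho K$ above. For example, `bound_trajectory(1.0, 0.9, 1e-4, 5.0, 500)` ends very close to `5e-3`, while a step size outside $(0, 2)$, such as `contraction_factor(2.5, [1.0])`, gives a factor of 1.5 and no contraction.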

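Under the condition $0 < \mu < 2$, letting $n$ approach infinity on both the LHS and the RHS of (16) and noting that as $n \to \infty$,
$E(\tilde{\mathbf{w}}_N(n+1)) \approx E(\tilde{\mathbf{w}}_N(n))$, one can obtain from (16),

$$\lim_{n\to\infty} E(\tilde{\mathbf{w}}_N(n)) = \frac{\rho}{\mu}\,\mathbf{B}^{-1}(\infty)\lim_{n\to\infty} E(\mathbf{G}^{-1/2}(n) \times \mathrm{sgn}(\mathbf{G}^{1/2}(n)\mathbf{w}_N(n))). \tag{23}$$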






















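Further, $\mathbf{B}(n)$ can be simplified in terms of $\mathbf{S}(n)$ by invoking the angular discretization model of a random vector as discussed
in Section III.B. We replace $\mathbf{s}(n)$ by $s_s(n)\,r_s(n)\,\mathbf{v}_s(n)$ as given by (12). One can then write,

$$\begin{aligned}
\mathbf{B}(n) &= E\!\left(\frac{\mathbf{s}(n)\mathbf{s}^T(n)}{\mathbf{s}^T(n)\mathbf{s}(n)}\right) = E\!\left(\frac{r_s^2(n)\,\mathbf{v}_s(n)\mathbf{v}_s^T(n)}{r_s^2(n)\,\mathbf{v}_s^T(n)\mathbf{v}_s(n)}\right) \quad (\text{since } s_s^2(n) = 1) \\
&= \sum_{i=0}^{L-1}\frac{\lambda_{s,i}(n)}{\mathrm{Tr}(\mathbf{S}(n))}\,\mathbf{e}_{s,i}(n)\mathbf{e}_{s,i}^T(n) = \frac{\mathbf{S}(n)}{\mathrm{Tr}(\mathbf{S}(n))},
\end{aligned} \tag{24}$$

since $\mathbf{S}(n) = \sum_{i=0}^{L-1}\lambda_{s,i}(n)\,\mathbf{e}_{s,i}(n)\mathbf{e}_{s,i}^T(n)$.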

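Letting $n$ approach infinity in (24) and substituting this in (23),

$$\lim_{n\to\infty} E(\tilde{\mathbf{w}}_N(n)) = \frac{\rho}{\mu}\,\mathrm{Tr}(\mathbf{S}(\infty))\,\mathbf{S}^{-1}(\infty)\lim_{n\to\infty} E(\mathbf{G}^{-1/2}(n) \times \mathrm{sgn}(\mathbf{G}^{1/2}(n)\mathbf{w}_N(n))). \tag{25}$$

Recalling $\tilde{\mathbf{w}}(n) = \mathbf{G}^{1/2}(n)\tilde{\mathbf{w}}_N(n)$, from (25) we have,

$$\lim_{n\to\infty} E(\tilde{\mathbf{w}}(n)) = \lim_{n\to\infty} E(\mathbf{G}^{1/2}(n))\,E(\tilde{\mathbf{w}}_N(n)) = \frac{\rho}{\mu}\,\mathrm{Tr}(\mathbf{S}(\infty))\,E(\mathbf{G}^{1/2}(n))\big|_{\infty}\,\mathbf{S}^{-1}(\infty)\lim_{n\to\infty} E(\mathbf{G}^{-1/2}(n)\,\mathrm{sgn}(\mathbf{w}(n))). \tag{26}$$

Further, we have $E(\tilde{\mathbf{w}}_N(n)) = E(\mathbf{G}^{-1/2}(n))\mathbf{w}_{opt} - E(\mathbf{w}_N(n))$ and $E(\tilde{\mathbf{w}}(n)) = \mathbf{w}_{opt} - E(\mathbf{w}(n))$, and thus, with (25) and
(26), this completes the proof.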


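Corollary 1. For white input, $\bar{w}_i(\infty)\ (= \lim_{n\to\infty} E(w_i(n)))$ for the $i$-th active tap (i.e., for which $w_{opt,i} \ne 0$) is approximately
given by

$$\bar{w}_i(\infty) = w_{opt,i} - \frac{\rho}{\mu}\,\bar{g}_i^{-1}(\infty)\,\mathrm{sgn}(w_{opt,i}), \tag{27}$$

where $\bar{g}_i(\infty) = \lim_{n\to\infty}\bar{g}_i(n)$ and $\bar{g}_i(n) = [E(\mathbf{G}(n))]_{i,i}$.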

Proof: For white input with variance $\sigma_x^2$, we have $\mathbf{R} = \sigma_x^2\mathbf{I}$, $\mathbf{S}(n) = \sigma_x^2 E(\mathbf{G}(n))$, $\mathrm{Tr}(\mathbf{S}(n)) = \sigma_x^2$ and $\mathbf{S}^{-1}(n) = \frac{1}{\sigma_x^2}E(\mathbf{G}(n))^{-1}$, and then, we can have a simplified expression of $\bar{\mathbf{w}}(\infty)$ as

$$\bar{\mathbf{w}}(\infty) \approx \mathbf{w}_{opt} - \frac{\rho}{\mu}\lim_{n\to\infty} E(\mathbf{G}(n))^{-1}\,E(\mathrm{sgn}(\mathbf{w}(n))), \tag{28}$$

where we have assumed that in the steady state as $n \to \infty$, $\mathbf{G}^{-1/2}(n)$ and $\mathrm{sgn}(\mathbf{w}(n))$ become statistically independent and
$E(\mathbf{G}^{\pm 1/2}(n)) \approx [E(\mathbf{G}(n))]^{\pm 1/2}$, which is reasonable as in the steady state, the variance of each individual $g_i(n)$, $i = 0, 1, \cdots, L-1$,
is quite small (i.e., it behaves almost like a constant). Now, for an active tap with significantly large magnitude $w_{opt,i}$, it is
reasonable to approximate $\mathrm{sgn}(w_i(n)) \approx \mathrm{sgn}(w_{opt,i})$ under the assumption that in the steady state, the variance of $w_i(n)$, i.e.,
$E((w_i(n) - w_{opt,i})^2)$, is small enough compared to the magnitude of $w_{opt,i}$. Then, with $E(\mathrm{sgn}(w_i(n))) \approx E(\mathrm{sgn}(w_{opt,i})) = \mathrm{sgn}(w_{opt,i})$ for an active tap in the steady state, the result follows trivially from (28). ■

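Corollary 1 shows that

$$\bar{w}_i(\infty) = \begin{cases} w_{opt,i} - \dfrac{\rho}{\mu}\,\bar{g}_i^{-1}(\infty), & \text{if } \mathrm{sgn}(w_{opt,i}) > 0 \\[2mm] w_{opt,i} + \dfrac{\rho}{\mu}\,\bar{g}_i^{-1}(\infty), & \text{if } \mathrm{sgn}(w_{opt,i}) < 0, \end{cases}$$

which implies that $\bar{w}_i(\infty)$ is always closer to the origin vis-a-vis $w_{opt,i}$. Further, the bias (i.e., usually defined as $w_{opt,i} - \bar{w}_i(\infty)$)
is also proportional to $\bar{g}_i^{-1}(\infty)$, meaning active taps with comparatively smaller values will have larger bias and vice versa.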
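The dependence of the bias on the tap magnitude can be made concrete with a small numerical sketch of Eq. (27). The function name and the steady-state gain values used in the usage note are hypothetical and chosen only for illustration.

```python
def sgn(v):
    return (v > 0) - (v < 0)


def steady_state_mean(w_opt_i, g_bar_i, mu, rho):
    """Predicted steady-state mean of an active tap for white input, Eq. (27).

    g_bar_i stands for the steady-state mean gain of the tap and must be positive.
    """
    return w_opt_i - (rho / mu) * sgn(w_opt_i) / g_bar_i
```

With $\mu = 0.7$ and $\rho = 10^{-4}$, taps of value 0.9, 0.1 and -0.05 with assumed gains 0.85, 0.09 and 0.05 give bias magnitudes of roughly $1.7 \times 10^{-4}$, $1.6 \times 10^{-3}$ and $2.9 \times 10^{-3}$ respectively, each pulling the mean towards the origin.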

In the case of inactive taps, we have $w_{opt,i} = 0$. From (14) and for $\rho = 0$ (i.e., no zero attraction), this implies $\bar{w}_i(\infty) = 0$,
i.e., the tap estimates fluctuate around zero. For $\rho > 0$, the zero attractors come into play in the update equation (9)
and act as an additional force that tries to pull the coefficients to zero from either side. The effect of the zero attractor is thus to
confine the fluctuations to a small band around zero. On average, one can then take $E(\mathrm{sgn}(w_i(n))) \approx 0$, meaning, from
(16), the inactive tap estimates will largely be free of any bias.
Fig. 1. Impulse response of the sparse system


IV. Numerical Simulations 

In this section, we investigate the evolution of $E(\mathbf{w}(n))$ of the proposed ZA-PNLMS algorithm with time via simulation studies
in the context of sparse system identification. For this, we considered a sparse system with an impulse response of length $L = 512$,
as shown in Fig. 1. The system has 37 active taps and is driven by a zero mean, white input $x(n)$ of variance $\sigma_x^2 = 1$, with
the output observation noise $v(n)$ taken to be zero mean, white Gaussian with $\sigma_v^2 = 10^{-3}$. The proposed ZA-PNLMS
algorithm is used to identify the system, for which the step size $\mu$, the zero attracting coefficient $\rho$ and the regularization
parameter $\delta_P$ (to avoid division by zero) are taken to be 0.7, 0.0001 and 0.01 respectively, while $\rho_g$ and $\delta$ are chosen as 0.01
and 0.001 respectively. The simulations are carried out for a total of 25,000 iterations and for each tap weight $w_i(n)$, the
learning curve $E[w_i(n)]$ vs. $n$ is evaluated by averaging $w_i(n)$ over 30 experiments. For demonstration here, we consider four
representative learning curves, for $i = 37, 55, 67, 1$ (the corresponding $w_{opt,i}$ given by 0.9, 0.1, -0.05 and 0 respectively). These
are shown in Figs. 2-5 respectively, where it is seen that for both the inactive tap (i.e., $w_{opt,1}$) and the active tap with relatively
large magnitude (i.e., $w_{opt,37}$), $E[w_i(n)]$ converges to its optimum values of 0 and 0.9 respectively. On the other hand, for
$w_{opt,67}$ and $w_{opt,55}$, i.e., for active taps with relatively smaller magnitudes, $E[w_i(n)]$ converges with reasonably large bias.
This validates our conjectures made in Section III (Corollary 1 and the subsequent analysis). To validate the same further, the
bias is calculated from the learning curves (in the steady state) for all the taps and then plotted in Fig. 6 against the magnitude
of the optimum tap weights. Clearly, the bias becomes negligible as the magnitude of the active tap increases.
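A reduced-scale version of such an experiment can be sketched in plain Python. This is not the paper's setup: the filter length (16 instead of 512), the number of iterations, the single run, the Gaussian input and the function name `run_za_pnlms` are all assumptions made to keep the example small and fast, while the algorithm parameters are those quoted above.

```python
import math
import random


def sgn(v):
    return (v > 0) - (v < 0)


def run_za_pnlms(w_opt, n_iter=3000, mu=0.7, rho=1e-4, delta_p=0.01,
                 rho_g=0.01, delta=0.001, noise_var=1e-3, seed=0):
    """Identify w_opt with ZA-PNLMS, Eq. (9); returns the final weight vector."""
    rng = random.Random(seed)
    taps = len(w_opt)
    w = [0.0] * taps
    x = [0.0] * taps
    noise_std = math.sqrt(noise_var)
    for _ in range(n_iter):
        # Shift a new unit-variance white Gaussian sample into the regressor.
        x = [rng.gauss(0.0, 1.0)] + x[:-1]
        d = sum(a * b for a, b in zip(w_opt, x)) + rng.gauss(0.0, noise_std)
        # Proportionate gains, Eqs. (2)-(4).
        floor = rho_g * max([delta] + [abs(c) for c in w])
        gamma = [max(floor, abs(c)) for c in w]
        total = sum(gamma)
        g = [v / total for v in gamma]
        # A priori error and ZA-PNLMS update, Eq. (9).
        e = d - sum(wi * xi for wi, xi in zip(w, x))
        norm = sum(gi * xi * xi for gi, xi in zip(g, x)) + delta_p
        w = [wi + mu * gi * xi * e / norm - rho * sgn(wi)
             for wi, gi, xi in zip(w, g, x)]
    return w


def misalignment(w, w_opt):
    """Squared Euclidean distance between the estimate and the true response."""
    return sum((a - b) ** 2 for a, b in zip(w, w_opt))
```

Averaging the returned weights over several seeds would give an estimate of the learning-curve endpoints $E[w_i(n)]$ for this toy system, which can then be compared against the prediction of Corollary 1.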